Advances in audio source separation and multisource audio content retrieval

Author

  • Emmanuel Vincent
Abstract

Audio source separation aims to extract the signals of individual sound sources from a given recording. In this paper, we review three recent advances which improve the robustness of source separation in challenging real-world scenarios and enable its use for multisource content retrieval tasks, such as automatic speech recognition (ASR) or acoustic event detection (AED) in noisy environments. We present a Flexible Audio Source Separation Toolkit (FASST) and discuss its advantages compared to earlier approaches such as independent component analysis (ICA) and sparse component analysis (SCA). We explain how cues as diverse as harmonicity, spectral envelope, temporal fine structure or spatial location can be jointly exploited by this toolkit. We subsequently present the uncertainty decoding (UD) framework for the integration of audio source separation and audio content retrieval. We show how the uncertainty about the separated source signals can be accurately estimated and propagated to the features. Finally, we explain how this uncertainty can be efficiently exploited by a classifier, both at the training and the decoding stage. We illustrate the resulting performance improvements in terms of speech separation quality and speaker recognition accuracy.

Similar resources

Features for Content-Based Audio Retrieval

Today, a large number of audio features exist in audio retrieval for different purposes, such as automatic speech recognition, music information retrieval, audio segmentation, and environmental sound retrieval. The goal of this paper is to review the latest research in the context of audio feature extraction and to give an application-independent overview of the most important existing techniques....

Criteria for Evaluating and Producing Audio Books from the Producers' Point of View: A Qualitative Content Analysis

Purpose: Audio books hold a special place in the publishing industry. Publishers around the world produce audio books according to different criteria and standards. This study aimed to identify and introduce the most important criteria for the evaluation and production of audio books from the producers' point of view. Methodology: This study was performed with qualitative content analysis of interview...

Content-based Retrieval in MIDI and Audio

This paper briefly reports on recent advances in the development of search engines for music and audio within the MiDiLiB project: while notify! is a tool for searching symbolic music data such as MIDI, audentify! allows searching PCM audio.

Vodcast: A Breakthrough in Developing Incidental Vocabulary Learning

Incidental vocabulary learning is often seen as superior to direct instruction on many occasions. Meanwhile, upon the emergence of the World Wide Web, second language (SL) learners have been introduced to 'podcasts' (recorded audio and video online broadcasts) which could be authentic sources of vocabulary learning. The relatively recent phenomenon of video podcast (vodcast) might be considered...

pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis

Audio information plays a rather important role in the increasing digital content that is available today, resulting in a need for methodologies that automatically analyze such content: audio event recognition for home automations and surveillance systems, speech recognition, music information retrieval, multimodal analysis (e.g. audio-visual analysis of online videos for content-based recommen...

Publication date: 2012